Executive Summary

This analysis seeks two objectives:

  • (1) predict production levels for the next four quarters via a time series analysis.
  • (2) predict whether a respondent is or is not in the 1975 labor market via machine learning classifciation models trained on a number of economic and demographic features.

First, this analysis provides the following predictions for production levels over the next four quarters (in our dataset, 2010 Q3 – 2011 Q2) (powered by an ARIMA model and Facebook's Prophet):

all-models

facebook-prophet-and-ARIMA-cropped

  • Credit is given to Jason Brownlee for incredible explanations of Time Series ML.

Second, this analysis correctly picked which respondent would be in the workforce with 80.79% accuracy (74.6% on cross-validation).

The winning model was a Logistic Regression Classification, followed closely by an XGBoost Classification, followed by a Keras Neural Network Classifier (powered by Google’s TensorFlow).

From a predictive standpoint, the logistic regression classifier found the following features to be most impactful in predicting whether a respondent (in this study, a heterosexual married woman) was in the 1975 labor force. The following is in order of biggest log-odds impact:

  • federal marginal tax rate facing woman (proxy for wealth)
    • increase here = lower labor force likelihood
      space
  • husband's hourly wage, 1975
    • increase here = lower labor force likelihood
      space
  • hours worked by husband, 1975
    • increase here = lower labor force likelihood
      space
  • number of kids under 6 years old
    • increase here = lower labor force likelihood
      space
  • age
    • increase here = lower labor force likelihood
      space
  • years of experience
    • increase here = higher labor force likelihood

From a decion-tree splitting perspective, XGBoost found the following features to be most important (though these numbers have no postive or negative 'sign,' so they do not indicate directionality). The following is in descending order of importantance (most important first):

  • years of experience
  • number of kids under 6 years old
  • number of kids between 6 and 18 years old
  • federal marginal tax rate facing woman (proxy for wealth)
  • mother's years of schooling
  • husband's age
  • mother's age
  • years of schooling

Credit to God, my Mother, family and friends.

All errors are my own.

Best,
George John Jordan Thomas Aquinas Hayward, Optimist

Selected Data Visualizations

Time Series

Time Series Autocorrelation Graph

autocorrelation

ARIMA Prediction

arima-prediction

Facebook Prophet Prediction

fb-prophet-prediction

ARIMA vs. FB Prophet Prediction

all-models

ARIMA vs. FB Prophet Prediction Numbers

facebook-prophet-and-ARIMA-cropped

Classification

Feature Pair Plot

labor-data-pair-plot-lower-res

Feature Correlation Matrix

labor-data-correlation-matrix

Logistic Regression Coefficient Interpretation

logistic-regression-coefs

XGBoost Feature Importance

laborforce-xgbfeatures

Logistic Regression Confustion Matrix

exhibit-g-logistic-regression-confusion-matrix

Key Assumptions & Plan of Attack

Time Series

Classification

  • I opted to use three different kinds of regression classification techniques:
  • Features were generally continuous, and the data was generally in great shape.
  • Upon future iterations of this analysis, more features could be engineered (sort of like the experience ^ 2 feature).
  • I eliminated features that were not predictive because they were not truly independent from the dependent variable:
    • For instance, ‘wage’ and ‘lwage’ must be 0 for those outside the labor force and must be greater than 0 for those in the workforce. As such these features are not really predictive.
    • Similarly, ‘faminc’ and 'nwifeinc' both take into account the wage of respondent (the woman in the household), so it also gives away whether the respondent is or is not in the workforce, I've also eliminated it.
      • In other words, it includes, though not perfectly, the information contained in the ‘wage’ feature, which will taint the predictive power of the model.
    • Finally, 'Repwage' is the wage reported in 1976, but since we're predicting for being in the labor force in 1975, then it should be taken out.
  • All models were cross-validated, and the final cross validation accuracies were as follows:
    • Logistic Regression Classification: 74.6%
    • XGBoost Classification: 71.3%
    • Keras Neural Network: 71.3%

Part 0. Load in Dependencies

  • Will load in dependencies for the ARIMA model, Facebook Prophet, XGBoost Classification, Logistic Regression Classification, and Keras Neural Network Classification Model (with Google's TensorFlow). Sk-learn Grid Search will also be loaded, along with Matplotlib.
In [93]:
#for time series work
import altair as alt
alt.renderers.enable('notebook')
import matplotlib.pyplot as plt
%config InlineBackend.figure_format = 'retina'
import pandas as pd
pd.set_option('display.max_columns', None)
from pandas import DataFrame
import numpy as np
from pandas.plotting import autocorrelation_plot
from statsmodels.tsa.arima_model import ARIMA
from math import sqrt
from sklearn.metrics import mean_squared_error
from fbprophet import Prophet
import warnings
warnings.filterwarnings("ignore") #just for the final
#for regression classification work
import seaborn as sns; sns.set()
import missingno as msno
from scipy import stats
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn import linear_model
from sklearn.model_selection import train_test_split, cross_val_score, GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_absolute_error, r2_score, median_absolute_error, \
explained_variance_score, confusion_matrix, accuracy_score, precision_score, recall_score
import xgboost as xgb
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import StratifiedKFold

Part I. Time Series Analysis

  • Plan to use ARIMA model and Facebook Prohpet Model.

A.) Forecasting

Analyze the “production” time series data in the provided file and choose a forecasting model that provides reasonable forecasts at a 1-4 quarter horizon. In addition to including and showing (through code output, visuals, or both) the selected forecasting model, please include descriptions of the following:

  • How did you decide on the specific forecasting model? What tests or plots did you use when considering various forecasting approaches?
  • How accurate is your model? How did you test its performance?

Please note: this exercise is meant to get a general sense of how you think about forecasting problems - it isn’t intended for you to find the “global optimal” forecast methodology for the provided data series. Accordingly, please limit your total time spent on the exercise to approximately one hour.</font>

In [2]:
#first, we load in the data and set up an ARIMA focused dataframe
df_arima = pd.read_csv('data_A.csv')
#we need to get the string quater naming convention into a datetime friendly format
#I've opted to use January 1 to represent the first quarter, April 1 for the second, July 1 for the third, 
#and October 1 for the fourth...this ensures our data points have the correct 3-month interval cadence
df_arima.time = df_arima.time.str.replace(' Q1', '-01-01', regex=False)
df_arima.time = df_arima.time.str.replace(' Q2', '-04-01', regex=False)
df_arima.time = df_arima.time.str.replace(' Q3', '-07-01', regex=False)
df_arima.time = df_arima.time.str.replace(' Q4', '-10-01', regex=False)
df_arima['time'] = pd.to_datetime(df_arima['time'],format='%Y-%m-%d')
df_arima.head()
Out[2]:
time production
0 1956-01-01 284
1 1956-04-01 213
2 1956-07-01 227
3 1956-10-01 308
4 1957-01-01 262
In [3]:
#now plotting the same thing more properly
plt.plot(df_arima.time, df_arima.production)
plt.show()

👆🏽Thoughts about this? 👆🏽

  • We can see some seasonality as the curve goes up and down with some regularity.
  • We do not have a stationary graph, because you can observe a trend from 1960 to 1975, and, again, another trend from 1980 to 2010.
In [103]:
#let's check the autocorrelation to see how many periods back our autoregression should track
autocorrelation_plot(df_arima.production)
plt.title("Autocorrelation Graph", fontweight = 'bold')
plt.savefig('autocorrelation.png',dpi=300, bbox_inches='tight')
plt.show()

👆🏽Thoughts about this? 👆🏽

  • We see a significant correlation up to around 25 periods, and an extremely high, positive correlation when we look at the first 5 periods.
    • This makes sense to me because there are only 4 quarters in a year, and I would think that the last year (so last 4 periods) would be perhaps the most helpful when thinking about the next four quarters.
In [12]:
X = df_arima.production.values
size = int(len(X) * 0.66)
train, test = X[0:size], X[size:len(X)]
history = [x for x in train]
predictions_arima = []
In [13]:
# walk-forward validation
for t in range(len(test)):
    model = ARIMA(history, order=(6,1,1)) 
    model_fit = model.fit(disp=0)
    output = model_fit.forecast()
    yhat = output[0]
    predictions_arima.append(yhat)
    obs = test[t]
    history.append(obs)
    print('predicted=%f, expected=%f' % (yhat, obs))
predicted=583.361869, expected=574.000000
predicted=469.884850, expected=443.000000
predicted=422.122631, expected=410.000000
predicted=428.541688, expected=420.000000
predicted=563.170100, expected=532.000000
predicted=439.575125, expected=433.000000
predicted=392.686590, expected=421.000000
predicted=409.343245, expected=410.000000
predicted=532.900785, expected=512.000000
predicted=439.102902, expected=449.000000
predicted=410.926546, expected=381.000000
predicted=415.435590, expected=423.000000
predicted=505.971137, expected=531.000000
predicted=437.016598, expected=426.000000
predicted=397.945416, expected=408.000000
predicted=426.781951, expected=416.000000
predicted=526.411806, expected=520.000000
predicted=441.402115, expected=409.000000
predicted=400.478497, expected=398.000000
predicted=415.083399, expected=398.000000
predicted=509.045246, expected=507.000000
predicted=405.230612, expected=432.000000
predicted=387.455530, expected=398.000000
predicted=406.855965, expected=406.000000
predicted=514.267067, expected=526.000000
predicted=434.329681, expected=428.000000
predicted=407.837323, expected=397.000000
predicted=417.402220, expected=403.000000
predicted=520.466045, expected=517.000000
predicted=431.045554, expected=435.000000
predicted=393.860797, expected=383.000000
predicted=407.807633, expected=424.000000
predicted=514.185726, expected=521.000000
predicted=436.695440, expected=421.000000
predicted=395.823406, expected=402.000000
predicted=423.849517, expected=414.000000
predicted=519.588853, expected=500.000000
predicted=430.609009, expected=451.000000
predicted=390.220780, expected=380.000000
predicted=420.725200, expected=416.000000
predicted=507.246522, expected=492.000000
predicted=441.080057, expected=428.000000
predicted=384.992543, expected=408.000000
predicted=408.082253, expected=406.000000
predicted=491.043855, expected=506.000000
predicted=440.463537, expected=435.000000
predicted=404.738565, expected=380.000000
predicted=422.091693, expected=421.000000
predicted=496.871507, expected=490.000000
predicted=430.317010, expected=435.000000
predicted=387.047701, expected=390.000000
predicted=414.949080, expected=412.000000
predicted=495.536671, expected=454.000000
predicted=438.672501, expected=416.000000
predicted=378.230516, expected=403.000000
predicted=396.894481, expected=408.000000
predicted=453.905157, expected=482.000000
predicted=422.256805, expected=438.000000
predicted=407.626210, expected=386.000000
predicted=429.602944, expected=405.000000
predicted=481.401355, expected=491.000000
predicted=430.930299, expected=427.000000
predicted=391.689579, expected=383.000000
predicted=409.369527, expected=394.000000
predicted=484.187923, expected=473.000000
predicted=427.008078, expected=420.000000
predicted=375.742417, expected=390.000000
predicted=390.939274, expected=410.000000
predicted=472.662500, expected=488.000000
predicted=428.574274, expected=415.000000
predicted=399.863735, expected=398.000000
predicted=416.609365, expected=419.000000
predicted=485.179996, expected=488.000000
predicted=423.665198, expected=414.000000
predicted=401.149464, expected=374.000000
In [14]:
#let's fit the ARIMA model
#the paramters used here are the reult of a grid search I ran in previous analysis
#model = ARIMA(df_arima.production, order=(6,1,1))
#model_fit = model.fit(disp=0)
In [15]:
print(model_fit.summary())
                             ARIMA Model Results                              
==============================================================================
Dep. Variable:                    D.y   No. Observations:                  216
Model:                 ARIMA(6, 1, 1)   Log Likelihood                -917.664
Method:                       css-mle   S.D. of innovations             16.626
Date:                Sat, 31 Aug 2019   AIC                           1853.327
Time:                        23:30:02   BIC                           1883.705
Sample:                             1   HQIC                          1865.600
                                                                              
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const          0.7911      0.555      1.425      0.156      -0.297       1.879
ar.L1.D.y     -1.3591      0.142     -9.544      0.000      -1.638      -1.080
ar.L2.D.y     -1.1343      0.160     -7.088      0.000      -1.448      -0.821
ar.L3.D.y     -0.6633      0.169     -3.927      0.000      -0.994      -0.332
ar.L4.D.y      0.3042      0.164      1.860      0.064      -0.016       0.625
ar.L5.D.y      0.6690      0.101      6.592      0.000       0.470       0.868
ar.L6.D.y      0.4357      0.062      6.974      0.000       0.313       0.558
ma.L1.D.y      0.3537      0.157      2.246      0.026       0.045       0.662
                                    Roots                                    
=============================================================================
                  Real          Imaginary           Modulus         Frequency
-----------------------------------------------------------------------------
AR.1            1.3435           -0.0000j            1.3435           -0.0000
AR.2           -0.0020           -1.0062j            1.0062           -0.2503
AR.3           -0.0020           +1.0062j            1.0062            0.2503
AR.4           -1.0252           -0.0000j            1.0252           -0.5000
AR.5           -0.9250           -0.8890j            1.2829           -0.3782
AR.6           -0.9250           +0.8890j            1.2829            0.3782
MA.1           -2.8273           +0.0000j            2.8273            0.5000
-----------------------------------------------------------------------------
In [16]:
#we check to see if there are any odd pattens in the residuals over the time series
residuals = DataFrame(model_fit.resid) 
residuals.plot()
plt.show()
In [17]:
#we continue to check residuals
residuals.plot(kind='kde') 
plt.show()
In [18]:
rmse = sqrt(mean_squared_error(test, predictions_arima)) 
print('Test RMSE: %.3f' % rmse)
Test RMSE: 15.996
In [20]:
#time to gridsearch for the best ARIMA model #this code comes from Jason Brownlee
# evaluate an ARIMA model for a given order (p,d,q)
def evaluate_arima_model(X, arima_order):
    # prepare training dataset
    train_size = int(len(X) * 0.66)
    train, test = X[0:train_size], X[train_size:]
    history = [x for x in train]
    # make predictions
    predictions_eval = []
    for t in range(len(test)):
        model = ARIMA(history, order=arima_order)
        model_fit = model.fit(disp=0)
        yhat = model_fit.forecast()[0]
        predictions_eval.append(yhat)
        history.append(test[t])
    # calculate out of sample error
    rmse = sqrt(mean_squared_error(test, predictions_eval))
    return rmse

# evaluate combinations of p, d and q values for an ARIMA model
def evaluate_models(dataset, p_values, d_values, q_values):
    dataset = dataset.astype('float32')
    best_score, best_cfg = float("inf"), None
    for p in p_values:
        for d in d_values:
            for q in q_values:
                order = (p,d,q)
                try:
                    rmse = evaluate_arima_model(dataset, order)
                    if rmse < best_score:
                        best_score, best_cfg = rmse, order
                    print('ARIMA%s RMSE=%.3f' % (order,rmse))
                except:
                    continue
    print('Best ARIMA%s RMSE=%.3f' % (best_cfg, best_score))

# evaluate parameters
p_values = [3,4,5,6,7]
d_values = range(0, 3)
q_values = range(0, 3)
evaluate_models(df_arima.production.values, p_values, d_values, q_values)
ARIMA(3, 0, 0) RMSE=50.645
ARIMA(3, 0, 1) RMSE=35.438
ARIMA(3, 0, 2) RMSE=26.444
ARIMA(3, 1, 0) RMSE=17.807
ARIMA(3, 1, 1) RMSE=17.850
ARIMA(3, 2, 0) RMSE=24.093
ARIMA(3, 2, 2) RMSE=16.793
ARIMA(4, 0, 1) RMSE=16.417
ARIMA(4, 1, 0) RMSE=17.877
ARIMA(4, 1, 1) RMSE=16.924
ARIMA(4, 2, 1) RMSE=16.877
ARIMA(4, 2, 2) RMSE=16.893
ARIMA(5, 0, 0) RMSE=16.427
ARIMA(5, 1, 0) RMSE=17.606
ARIMA(5, 1, 1) RMSE=17.024
ARIMA(5, 2, 1) RMSE=64.289
ARIMA(6, 0, 0) RMSE=17.086
ARIMA(6, 1, 0) RMSE=16.269
ARIMA(6, 1, 1) RMSE=15.996
ARIMA(7, 0, 0) RMSE=16.105
Best ARIMA(6, 1, 1) RMSE=15.996
In [24]:
#4 more predictions
# walk-forward validation
predictions_four_more_arima = []
for t in range(4):
    model = ARIMA(history, order=(6,1,1)) 
    model_fit = model.fit(disp=0)
    output = model_fit.forecast()
    yhat = output[0]
    predictions_four_more_arima.append(yhat)
    history.append(yhat)
    print('predicted=%f' % (yhat))
predicted=419.813278
predicted=480.280402
predicted=410.967236
predicted=379.157988
In [25]:
#key for us is what are the next four quarters going to be like?
for i in range(4):
    print(predictions_four_more_arima[i])
[419.81327788]
[480.28040207]
[410.96723569]
[379.15798845]
In [26]:
#we can add those four predictions to a dataframe, and concatenate it to the original
pred_array_arima = {'time': ['2010-07-01', '2010-10-01', '2011-01-01', '2011-04-01'], 'production': [428.72285814,\
                                                        481.60032708,410.96723569,379.15798845]}
pred_df_arima = pd.DataFrame(data=pred_array_arima)
pred_df_arima
Out[26]:
time production
0 2010-07-01 428.722858
1 2010-10-01 481.600327
2 2011-01-01 410.967236
3 2011-04-01 379.157988
In [29]:
combined_future_arima = pd.concat([df_arima,pred_df_arima], axis=0, ignore_index=True)
combined_future_arima['time'] = pd.to_datetime(combined_future_arima['time'],format='%Y-%m-%d')
In [57]:
plt.plot(df_arima.time, df_arima.production, label="Actual")
plt.plot(combined_future_arima.time[-5:], combined_future_arima.production[-5:], label = 'ARIMA Predicted')
plt.suptitle("Production Historicals & Predictions: \n Time Series Analysis (ARIMA)",\
             fontsize = 12, fontweight = 'bold')
plt.xlabel("Year")
plt.ylabel("Production")
#plt.xticks(np.arange(20, 220, step=38), ('1960', '1970', '1980', '1990', '2000','2010'))
plt.legend()
plt.savefig('arima_prediction.png',dpi=300, bbox_inches='tight')
plt.show()
In [33]:
df_prophet = pd.read_csv('data_A.csv')
df_prophet.time = df_prophet.time.str.replace(' Q1', '-01-01', regex=False)
df_prophet.time = df_prophet.time.str.replace(' Q2', '-04-01', regex=False)
df_prophet.time = df_prophet.time.str.replace(' Q3', '-07-01', regex=False)
df_prophet.time = df_prophet.time.str.replace(' Q4', '-10-01', regex=False)
#need to get the format and naming conventions correct for Facebook Prophet
df_prophet['time'] = pd.to_datetime(df_prophet['time'],format='%Y-%m-%d')
df_prophet = df_prophet.rename(columns={"time": "ds", "production": "y"})
df_prophet.head()
Out[33]:
ds y
0 1956-01-01 284
1 1956-04-01 213
2 1956-07-01 227
3 1956-10-01 308
4 1957-01-01 262
In [40]:
m = Prophet(weekly_seasonality=False, daily_seasonality=False)
m.fit(df_prophet)
Out[40]:
<fbprophet.forecaster.Prophet at 0x1c1f1d3c50>
In [41]:
future = m.make_future_dataframe(periods=365)
In [42]:
forecast = m.predict(future)
forecast[['ds', 'yhat']].tail()
Out[42]:
ds yhat
578 2011-03-28 369.188152
579 2011-03-29 369.237345
580 2011-03-30 369.676611
581 2011-03-31 370.495194
582 2011-04-01 371.681714
In [43]:
prophet_predictor_times = ['2010-07-01', '2010-10-01', '2011-01-01', '2011-04-01']
In [45]:
dates = []
prophet_predictions = []
for i in prophet_predictor_times:
    dates.append(i)
    prophet_predictions.append(forecast.yhat[forecast.ds == i].values)
In [46]:
for i in prophet_predictions:
    print(i)
[387.42722424]
[485.30002099]
[417.43768147]
[371.68171404]
In [47]:
pred_array_prophet = {'ds': ['2010-07-01', '2010-10-01', '2011-01-01', '2011-04-01'], 'y': [387.42722424,\
                                                        485.30002099,417.43768147,371.68171404]}
pred_df_prophet  = pd.DataFrame(data=pred_array_prophet )
pred_df_prophet 
Out[47]:
ds y
0 2010-07-01 387.427224
1 2010-10-01 485.300021
2 2011-01-01 417.437681
3 2011-04-01 371.681714
In [48]:
combined_future_prophet = pd.concat([df_prophet,pred_df_prophet], axis=0, ignore_index=True)
combined_future_prophet['ds'] = pd.to_datetime(combined_future_prophet['ds'],format='%Y-%m-%d')
In [58]:
plt.plot(df_prophet.ds, df_prophet.y, label="Actual")
plt.plot(combined_future_prophet.ds[-5:], combined_future_prophet.y[-5:], label = 'FB Prophet Predicted')
plt.suptitle("Production Historicals & Predictions: \n Time Series Analysis (Facebook Prophet)",\
             fontsize = 12, fontweight = 'bold')
plt.xlabel("Year")
plt.ylabel("Production")
plt.legend()
plt.savefig('fb_prophet_prediction.png',dpi=300, bbox_inches='tight')
plt.show()
In [59]:
plt.plot(df_prophet.ds, df_prophet.y, label="Actual")
plt.plot(combined_future_prophet.ds[-5:], combined_future_prophet.y[-5:], label = 'FB Prophet Predicted')
plt.plot(combined_future_arima.time[-5:], combined_future_arima.production[-5:], label = 'ARIMA Predicted')
plt.suptitle("Production Historicals and Predictions \n Time Series Analysis (ARIMA & Facebook Prophet)",\
             fontsize = 12, fontweight = 'bold')
plt.xlabel("Year")
plt.ylabel("Production")
plt.legend()
plt.savefig('all_models.png',dpi=300, bbox_inches='tight')
plt.show()
In [60]:
#we can add those four predictions to a dataframe, and concatenate it to the original
pred_array_arima_and_prophet = {'Quarter': ['2010-07-01', '2010-10-01', '2011-01-01', '2011-04-01'],\
                            'ARIMA Predictions': [428.72285814, 481.60032708, 410.96723569, 379.15798845],\
                            'FB Prophet Predictions': [387.42722424,485.30002099,417.43768147,371.68171404]
                               }
pred_df_arima_prophet = pd.DataFrame(data=pred_array_arima_and_prophet)
pred_df_arima_prophet
Out[60]:
Quarter ARIMA Predictions FB Prophet Predictions
0 2010-07-01 428.722858 387.427224
1 2010-10-01 481.600327 485.300021
2 2011-01-01 410.967236 417.437681
3 2011-04-01 379.157988 371.681714

Part II. Classification Analysis

  • Plan to a logistic regression classification, a XGBoost classification, and a Keras Neural Network classification.

B.) Regression/Machine Learning

Use the data provided to create a model that predicts labor force participation (inlf variable in the dataset).
You are free to use any combination of the other variables for this prediction.
Here is a list of the included variables and their descriptions.

regression-Data

In addition to describing the selected model, please describe how you chose the model and how you tested its effectiveness in predicting the variable of interest.</font>

In [61]:
labordata = pd.read_csv("data_B.csv")
In [62]:
labordata.head()
Out[62]:
inlf hours kidslt6 kidsge6 age educ wage repwage hushrs husage huseduc huswage faminc mtr motheduc fatheduc unem city exper nwifeinc lwage expersq
0 1 1610 1 0 32 12 3.3540 2.65 2708 34 12 4.0288 16310 0.7215 12 7 5.0 0 14 10.910060 1.210154 196
1 1 1656 0 2 30 12 1.3889 2.65 2310 30 9 8.4416 21800 0.6615 7 7 11.0 1 5 19.499981 0.328512 25
2 1 1980 1 3 35 12 4.5455 4.04 3072 40 12 3.5807 21040 0.6915 12 7 5.0 0 15 12.039910 1.514138 225
3 1 456 0 3 34 12 1.0965 3.25 1920 53 10 3.5417 7300 0.7815 7 7 5.0 0 6 6.799996 0.092123 36
4 1 1568 1 2 31 14 4.5918 3.60 2000 32 12 10.0000 27300 0.6215 12 14 9.5 1 7 20.100060 1.524272 49
In [63]:
msno.matrix(labordata,  color = (.0,.0,.2))
Out[63]:
<matplotlib.axes._subplots.AxesSubplot at 0x1c1e0d0278>

👆🏽Thoughts about this? 👆🏽

  • Using this null data visualization, we can immediately see that there is something up with the 'wage' and 'lwage' columns.
  • This makes sense:
    • If you are making a wage or thus a log-transformed wage, then you must have been working in 1975. Thus, this data point is not actually predictive, and I'll soon take it out of the dataset.
In [64]:
#missing data
total = labordata.isnull().sum().sort_values(ascending=False)
percent = (labordata.isnull().sum()/labordata.isnull().count()).sort_values(ascending=False)
missing_data = pd.concat([total, percent], axis=1, keys=['Total', 'Percent'])
missing_data.head()
Out[64]:
Total Percent
lwage 325 0.431607
wage 325 0.431607
expersq 0 0.000000
hours 0 0.000000
kidslt6 0 0.000000
In [65]:
#dropping features we do not want in the analysis
labordata_processed = labordata.drop(['lwage','wage', 'repwage','hours',\
                                     'faminc','nwifeinc'], axis = 1)
#each of these feautres is basically a proxy for already being in the workforce, so they aren't predictors
#repwage happens in 1976, but we're predicting for 1975 so that has to be out
#family income and nwifeinc are backdoors for the income of the woman, who would then already be in the workforce

👆🏽Thoughts about this? 👆🏽

  • Each of these features is basically a proxy for already being in the workforce, so they aren't predictors.
  • Along those lines, 'faminc' and 'nwifeinc' both take into account the wage of the person being studied, so it also gives away whether the respondent is or is not in the workforce, I've also eliminated it.
  • 'Repwage' is the wage reported in 1976, but since we're predicting for being in the labor force in 1975, then it should be taken out.
In [66]:
msno.matrix(labordata_processed,  color = (.0,.0,.2))
Out[66]:
<matplotlib.axes._subplots.AxesSubplot at 0x1c206285f8>
In [67]:
#all credit due to: Pedro Marcelino 
#https://www.kaggle.com/pmarcelino/comprehensive-data-exploration-with-python
#correlation matrix
corrmat = labordata_processed.corr()
f, ax = plt.subplots(figsize=(12, 9))
sns.heatmap(corrmat, vmax=.8, square=True)
plt.savefig('labor_data_correlation_matrix.png',dpi=300, bbox_inches='tight')

👆🏽Thoughts about this? 👆🏽

  • Correlation plots are great for seeing which kinds of features 'move' with each other.
  • One interesting thing to see is the correlation between age and husband's age.
    • Looks like people of around the same age get married together.
In [68]:
#all credit due to: Pedro Marcelino 
#https://www.kaggle.com/pmarcelino/comprehensive-data-exploration-with-python
#scatterplot
sns.set()
#cols = ['column1', 'column2']
sns.pairplot(labordata_processed, size = 2.5)
#plt.savefig('exhibit_0b_pair_plot.png',dpi=600, bbox_inches='tight')
#600 dpi is commented out. It's useful for viewing all the data up close on your desktop, but slows down the script.
plt.savefig('labor_data_pair_plot_lower_res.png',dpi=150, bbox_inches='tight')
plt.show()

👆🏽Thoughts about this? 👆🏽

  • Pair plots are great for quickly scanning all the data and seeing realtionships.
    • We must remember that they don't take into account interactions.
In [69]:
#address skew in features
#For this block, credit goes to Alexandru Papiu 
#(https://www.kaggle.com/apapiu/regularized-linear-models)
#log transform skewed numeric features:
continous_features_classification = ['kidslt6', 'kidsge6', 'age', 'educ', 'hushrs', 'husage',\
'huseduc', 'huswage', 'mtr', 'motheduc', 'fatheduc', 'unem', 'exper', 'expersq'] 
#exposure units have already been taken out
skewed_feats = labordata_processed[continous_features_classification].apply(\
                                                                                lambda x: stats.skew(x)) 
#compute skewness
skewed_feats = skewed_feats[skewed_feats > 0.75]
skewed_feats = skewed_feats.index

labordata_processed[skewed_feats] = np.log1p(labordata_processed[skewed_feats])
In [70]:
#xgboost
classification_features_xgb = labordata_processed.drop(['inlf'], axis = 1)
classification_outcome_xgb = labordata_processed.inlf
train_features, test_features, train_labels, test_labels = train_test_split(classification_features_xgb,\
                                                                    classification_outcome_xgb, test_size = 0.2)

xgb_classy = xgb.XGBClassifier()

xgb_classy.fit(train_features,train_labels)

xgb_classy_predictions = xgb_classy.predict(test_features)


#run model
print("_________XGBoost Regression Classification_________")
print("")
print("Scored Against Itself")
print('Accuracy Score: {}'.format(round(xgb_classy.score(train_features, train_labels),3)))
print("")
print("Scored Against Test Data")
print('Accuracy Score: {}'.format(round(xgb_classy.score(test_features, test_labels),3)))
_________XGBoost Regression Classification_________

Scored Against Itself
Accuracy Score: 0.887

Scored Against Test Data
Accuracy Score: 0.748
In [71]:
#cross-val #this will run about 70 seconds per print
print("_________Cross-Validation Scoring for XGBoost Classification_________")
print('Accuracy: {}'.format(round(cross_val_score(xgb_classy, train_features, train_labels, \
                                                                cv=10, scoring='accuracy').mean(),3)))
print('Precision: {}'.format(round(cross_val_score(xgb_classy, train_features, train_labels, \
                                                                cv=10, scoring='precision').mean(),3)))
print('Recall: {}'.format(round(cross_val_score(xgb_classy, train_features, train_labels, \
                                                                cv=10, scoring='recall').mean(),3)))
_________Cross-Validation Scoring for XGBoost Classification_________
Accuracy: 0.713
Precision: 0.739
Recall: 0.769
In [72]:
top_10_xgb_features = pd.DataFrame(sorted(list(zip(classification_features_xgb,xgb_classy.feature_importances_))\
       ,key = lambda x: abs(x[1]),reverse=True)[:10], columns=['Feature', 'XGBoost Importance'])
top_10_xgb_features
Out[72]:
Feature XGBoost Importance
0 exper 0.176079
1 kidslt6 0.081903
2 kidsge6 0.079420
3 mtr 0.079354
4 motheduc 0.075405
5 husage 0.074865
6 age 0.073868
7 educ 0.072686
8 huswage 0.063095
9 hushrs 0.051104
In [73]:
#plt.xticks(rotation=-25)
bar_count = range(len(top_10_xgb_features.Feature))
fig, axs = plt.subplots(ncols=2, figsize=(14,4))
#using a subplot method coupled with an inline parameter to have high resolution
#note: "[::-1]" reverses the column in a pandas dataframe
axs[1].set_axis_off()
axs[0].barh(bar_count, top_10_xgb_features['XGBoost Importance'][::-1],\
                 align='center', alpha=1)
axs[0].set_xlabel('Values')
axs[0].set_yticks(bar_count)
axs[0].set_yticklabels(top_10_xgb_features.Feature[::-1], fontsize=10)
axs[0].set_xlabel('XGBoost Importance')
axs[0].set_title("XGBoost's Feature Importances",fontweight = 'bold')

extent = axs[0].get_window_extent().transformed(fig.dpi_scale_trans.inverted())
fig.savefig('laborforce_xgbfeatures',dpi=300, bbox_inches=extent.expanded(1.5, 1.5))
plt.show()
In [74]:
xgb_classy_predictions = xgb_classy.predict(test_features) #redundant, but copied here too so I can see it
xgb_cm = confusion_matrix(test_labels, xgb_classy_predictions)
#print(cm) #this is the barebones confusion matrix

#all credit due to: Michael Galarnyk, "Logistic Regression using Python (scikit-learn)", Towards Data Science 
plt.figure(figsize=(9,9))
sns.heatmap(xgb_cm, annot=True, fmt=".0f", linewidths=.5, square = True, cmap = 'Blues_r');
plt.ylabel('Actual Label');
plt.xlabel('Predicted Label');
all_sample_title = 'XGBoost Regression Classification \n Accuracy Score: {0:.3f}'.format(\
                                                                xgb_classy.score(test_features, test_labels))
plt.title(all_sample_title, size = 15)
plt.savefig('exhibit_h_xgboost_regression_confusion_matrix',dpi=300, bbox_inches='tight')
print("Accuracy: "+str(accuracy_score(test_labels, xgb_classy_predictions))) #this is just a little check at the end
print("Precision: "+str(precision_score(test_labels, xgb_classy_predictions))) #this is just a little check at the end
print("Recall: "+str(recall_score(test_labels, xgb_classy_predictions))) #this is just a little check at the end
Accuracy: 0.7483443708609272
Precision: 0.7608695652173914
Recall: 0.813953488372093
In [76]:
#logistic regression
classification_features = labordata_processed.drop(['inlf'], axis = 1)
classification_outcome = labordata_processed.inlf
train_features, test_features, train_labels, test_labels = train_test_split(classification_features,\
                                                                            classification_outcome, test_size = 0.2)
#normalize this, since sklearn's logistic regression uses regularization
scaler = StandardScaler()
#"To determine the scaling factors and apply the scaling to the feature data:" -Codecademy
classification_train_features = scaler.fit_transform(train_features)
#"To apply the scaling to the test data:" -Codecademy
classification_test_features = scaler.transform(test_features) #we do NOT want to fit to the test

#run model
log_model = LogisticRegression(solver="liblinear") #to remove warning
log_model.fit(classification_train_features, train_labels)
print("_________Logistic Regression Classification_________")
print("")
print("Scored Against Itself")
print('Accuracy Score: {}'.format(round(log_model.score(classification_train_features, train_labels),3)))
print("")
print("Scored Against Test Data")
print('Accuracy Score: {}'.format(round(log_model.score(classification_test_features, test_labels),3)))
_________Logistic Regression Classification_________

Scored Against Itself
Accuracy Score: 0.771

Scored Against Test Data
Accuracy Score: 0.808
In [77]:
#cross-val
print("_________Cross-Validation Scoring for Logistic Regression Classification_________")
print('Accuracy: {}'.format(round(cross_val_score(log_model, classification_train_features, train_labels, \
                                                                cv=10, scoring='accuracy').mean(),3)))
print('Precision: {}'.format(round(cross_val_score(log_model, classification_train_features, train_labels, \
                                                                cv=10, scoring='precision').mean(),3)))
print('Recall: {}'.format(round(cross_val_score(log_model, classification_train_features, train_labels, \
                                                                cv=10, scoring='recall').mean(),3)))
_________Cross-Validation Scoring for Logistic Regression Classification_________
Accuracy: 0.746
Precision: 0.761
Recall: 0.803
In [78]:
log_regression_feature_list = []
log_regression_coef_list = []
odds_ratios = []
percent_change_in_odds = []
for i in classification_features.columns:
    log_regression_feature_list.append(i)
for i in range(len(log_model.coef_[0])):
    log_regression_coef_list.append(log_model.coef_[0][i])
for i in log_regression_coef_list:
    odds_ratios.append(np.exp(i))
for i in odds_ratios:
    percent_change_in_odds.append(round((i-1)*100,2))
In [79]:
top_10_log_reg_features = pd.DataFrame(sorted(list(zip(log_regression_feature_list,log_regression_coef_list, \
                                                       percent_change_in_odds))\
       ,key = lambda x: abs(x[1]),reverse=True)[:10], columns=['Feature', 'Logistic Regression Coefficient',\
                                                              'Percent Change in Odds of Being in Labor Force'])
top_10_log_reg_features
Out[79]:
Feature Logistic Regression Coefficient Percent Change in Odds of Being in Labor Force
0 mtr -1.249254 -71.33
1 huswage -1.224260 -70.60
2 hushrs -0.683773 -49.53
3 kidslt6 -0.652633 -47.93
4 age -0.561213 -42.95
5 exper 0.417370 51.80
6 educ 0.398756 49.00
7 expersq 0.364379 43.96
8 huseduc -0.223389 -20.02
9 motheduc 0.153784 16.62
In [80]:
log_model_predictions = log_model.predict(classification_test_features)
cm = confusion_matrix(test_labels, log_model_predictions)
#print(cm) #this is the barebones confusion matrix

#all credit due to: Michael Galarnyk, "Logistic Regression using Python (scikit-learn)", Towards Data Science 
plt.figure(figsize=(9,9))
sns.heatmap(cm, annot=True, fmt=".0f", linewidths=.5, square = True, cmap = 'Blues_r');
plt.ylabel('Actual Label');
plt.xlabel('Predicted Label');
all_sample_title = 'Logistic Regression Classification \n Accuracy Score: {0:.3f}'.format(\
                                                log_model.score(classification_test_features, test_labels))
plt.title(all_sample_title, size = 15)
plt.savefig('exhibit_g_logistic_regression_confusion_matrix',dpi=300, bbox_inches='tight')
print("Accuracy: "+str(accuracy_score(test_labels, log_model_predictions))) #this is just a little check at the end
print("Precision: "+str(precision_score(test_labels, log_model_predictions))) #this is just a little check at the end
print("Recall: "+str(recall_score(test_labels, log_model_predictions))) #this is just a little check at the end
Accuracy: 0.8079470198675497
Precision: 0.8152173913043478
Recall: 0.8620689655172413
In [82]:
# fix random seed for reproducibility
seed = 7
np.random.seed(seed)
In [83]:
#keras neural network regression
classification_features_nn = labordata_processed.drop(['inlf'], axis = 1)
classification_outcome_nn = labordata_processed.inlf
train_features_nn, test_features_nn, train_labels_nn, test_labels_mm = train_test_split(classification_features_nn,\
                                                                classification_outcome_nn, test_size = 0.2)
In [84]:
#normalize this, since sklearn's logistic regression uses regularization
scaler = StandardScaler()
#"To determine the scaling factors and apply the scaling to the feature data:" -Codecademy
classification_train_features_nn = scaler.fit_transform(train_features_nn)
#"To apply the scaling to the test data:" -Codecademy
classification_test_features_nn = scaler.transform(test_features_nn) #we do NOT want to fit to the test
In [86]:
def create_model(optimizer='rmsprop', init='uniform'):
    # create model
    model = Sequential()
    model.add(Dense(12, input_dim=15, activation='relu'))
    model.add(Dense(15, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    #model.fit(train_features, train_labels, epochs=150, batch_size=10)
    return model
In [94]:
warnings.filterwarnings("ignore",category=DeprecationWarning)
# create model
model = KerasClassifier(build_fn=create_model, epochs=150, batch_size=10, verbose=0)
model.fit(train_features, train_labels, epochs=150, batch_size=10)
# evaluate using 10-fold cross validation
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
results = cross_val_score(model, classification_train_features_nn, train_labels_nn, cv=kfold)
print(results.mean())
0.7112035788444553
In [95]:
print("_________Cross-Validation Scoring for Keras Neural Network Classification_________")
print('Accuracy: {}'.format(round(cross_val_score(model, classification_train_features_nn, train_labels_nn, \
                                                                cv=10, scoring='accuracy').mean(),3)))
print('Precision: {}'.format(round(cross_val_score(model, classification_train_features_nn, train_labels_nn, \
                                                                cv=10, scoring='precision').mean(),3)))
print('Recall: {}'.format(round(cross_val_score(model, classification_train_features_nn, train_labels_nn, \
                                                                cv=10, scoring='recall').mean(),3)))
_________Cross-Validation Scoring for Keras Neural Network Classification_________
Accuracy: 0.713
Precision: 0.737
Recall: 0.797
In [100]:
model.fit(classification_train_features_nn, train_labels_nn, epochs=150, batch_size=10)
Out[100]:
<keras.callbacks.History at 0x1c85b68630>
In [99]:
#keras
model_predictions_nn = model.predict(classification_test_features_nn)
cm = confusion_matrix(test_labels, model_predictions_nn)
#print(cm) #this is the barebones confusion matrix

#all credit due to: Michael Galarnyk, "Logistic Regression using Python (scikit-learn)", Towards Data Science 
plt.figure(figsize=(9,9))
sns.heatmap(cm, annot=True, fmt=".0f", linewidths=.5, square = True, cmap = 'Blues_r');
plt.ylabel('Actual Label');
plt.xlabel('Predicted Label');
all_sample_title = 'Keras Neural Netowrk Classification \n Accuracy Score: {0:.3f}'.format(\
                                                model.score(classification_test_features_nn, test_labels))
plt.title(all_sample_title, size = 15)
plt.savefig('neural_network_confusion_matrix',dpi=300, bbox_inches='tight')
print("Accuracy: "+str(accuracy_score(test_labels, model_predictions_nn))) #this is just a little check at the end
print("Precision: "+str(precision_score(test_labels, model_predictions_nn))) #this is just a little check at the end
print("Recall: "+str(recall_score(test_labels, model_predictions_nn))) #this is just a little check at the end
Accuracy: 0.41721854304635764
Precision: 0.49382716049382713
Recall: 0.45977011494252873

👆🏽Thoughts about this? 👆🏽

  • I believe this accuracy is so much lower than ususal just becasue of the test group selection.
  • The cross-validation accuracy is much higher (>70%)

👇🏾Thoughts about this? 👇🏾

  • The below block of code comes from Jason Brownlee.
In [98]:
warnings.filterwarnings("ignore",category=DeprecationWarning)
# grid search epochs, batch size and optimizer
optimizers = ['rmsprop', 'adam']
inits = ['glorot_uniform', 'normal', 'uniform']
epochs = [50, 100, 150]
batches = [5, 10, 20]
param_grid = dict(optimizer=optimizers, epochs=epochs, batch_size=batches, init=inits)
grid = GridSearchCV(estimator=model, param_grid=param_grid)
grid_result = grid.fit(train_features, train_labels)
# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))
Best: 0.636213 using {'batch_size': 5, 'epochs': 50, 'init': 'uniform', 'optimizer': 'rmsprop'}
0.566445 (0.052454) with: {'batch_size': 5, 'epochs': 50, 'init': 'glorot_uniform', 'optimizer': 'rmsprop'}
0.528239 (0.079805) with: {'batch_size': 5, 'epochs': 50, 'init': 'glorot_uniform', 'optimizer': 'adam'}
0.471761 (0.079805) with: {'batch_size': 5, 'epochs': 50, 'init': 'normal', 'optimizer': 'rmsprop'}
0.528239 (0.079805) with: {'batch_size': 5, 'epochs': 50, 'init': 'normal', 'optimizer': 'adam'}
0.636213 (0.032536) with: {'batch_size': 5, 'epochs': 50, 'init': 'uniform', 'optimizer': 'rmsprop'}
0.471761 (0.079805) with: {'batch_size': 5, 'epochs': 50, 'init': 'uniform', 'optimizer': 'adam'}
0.476744 (0.081397) with: {'batch_size': 5, 'epochs': 100, 'init': 'glorot_uniform', 'optimizer': 'rmsprop'}
0.558140 (0.085022) with: {'batch_size': 5, 'epochs': 100, 'init': 'glorot_uniform', 'optimizer': 'adam'}
0.584718 (0.101911) with: {'batch_size': 5, 'epochs': 100, 'init': 'normal', 'optimizer': 'rmsprop'}
0.438538 (0.058214) with: {'batch_size': 5, 'epochs': 100, 'init': 'normal', 'optimizer': 'adam'}
0.471761 (0.079805) with: {'batch_size': 5, 'epochs': 100, 'init': 'uniform', 'optimizer': 'rmsprop'}
0.498339 (0.042208) with: {'batch_size': 5, 'epochs': 100, 'init': 'uniform', 'optimizer': 'adam'}
0.471761 (0.079805) with: {'batch_size': 5, 'epochs': 150, 'init': 'glorot_uniform', 'optimizer': 'rmsprop'}
0.433555 (0.052454) with: {'batch_size': 5, 'epochs': 150, 'init': 'glorot_uniform', 'optimizer': 'adam'}
0.574751 (0.093553) with: {'batch_size': 5, 'epochs': 150, 'init': 'normal', 'optimizer': 'rmsprop'}
0.581395 (0.063342) with: {'batch_size': 5, 'epochs': 150, 'init': 'normal', 'optimizer': 'adam'}
0.476744 (0.081397) with: {'batch_size': 5, 'epochs': 150, 'init': 'uniform', 'optimizer': 'rmsprop'}
0.558140 (0.120625) with: {'batch_size': 5, 'epochs': 150, 'init': 'uniform', 'optimizer': 'adam'}
0.433555 (0.052454) with: {'batch_size': 10, 'epochs': 50, 'init': 'glorot_uniform', 'optimizer': 'rmsprop'}
0.528239 (0.079805) with: {'batch_size': 10, 'epochs': 50, 'init': 'glorot_uniform', 'optimizer': 'adam'}
0.523256 (0.081397) with: {'batch_size': 10, 'epochs': 50, 'init': 'normal', 'optimizer': 'rmsprop'}
0.476744 (0.081397) with: {'batch_size': 10, 'epochs': 50, 'init': 'normal', 'optimizer': 'adam'}
0.438538 (0.058214) with: {'batch_size': 10, 'epochs': 50, 'init': 'uniform', 'optimizer': 'rmsprop'}
0.433555 (0.052454) with: {'batch_size': 10, 'epochs': 50, 'init': 'uniform', 'optimizer': 'adam'}
0.566445 (0.052454) with: {'batch_size': 10, 'epochs': 100, 'init': 'glorot_uniform', 'optimizer': 'rmsprop'}
0.498339 (0.100967) with: {'batch_size': 10, 'epochs': 100, 'init': 'glorot_uniform', 'optimizer': 'adam'}
0.523256 (0.081397) with: {'batch_size': 10, 'epochs': 100, 'init': 'normal', 'optimizer': 'rmsprop'}
0.476744 (0.081397) with: {'batch_size': 10, 'epochs': 100, 'init': 'normal', 'optimizer': 'adam'}
0.586379 (0.056357) with: {'batch_size': 10, 'epochs': 100, 'init': 'uniform', 'optimizer': 'rmsprop'}
0.566445 (0.052454) with: {'batch_size': 10, 'epochs': 100, 'init': 'uniform', 'optimizer': 'adam'}
0.528239 (0.079805) with: {'batch_size': 10, 'epochs': 150, 'init': 'glorot_uniform', 'optimizer': 'rmsprop'}
0.529900 (0.078042) with: {'batch_size': 10, 'epochs': 150, 'init': 'glorot_uniform', 'optimizer': 'adam'}
0.438538 (0.058214) with: {'batch_size': 10, 'epochs': 150, 'init': 'normal', 'optimizer': 'rmsprop'}
0.476744 (0.081397) with: {'batch_size': 10, 'epochs': 150, 'init': 'normal', 'optimizer': 'adam'}
0.438538 (0.058214) with: {'batch_size': 10, 'epochs': 150, 'init': 'uniform', 'optimizer': 'rmsprop'}
0.476744 (0.081397) with: {'batch_size': 10, 'epochs': 150, 'init': 'uniform', 'optimizer': 'adam'}
0.549834 (0.109166) with: {'batch_size': 20, 'epochs': 50, 'init': 'glorot_uniform', 'optimizer': 'rmsprop'}
0.523256 (0.081397) with: {'batch_size': 20, 'epochs': 50, 'init': 'glorot_uniform', 'optimizer': 'adam'}
0.544851 (0.078925) with: {'batch_size': 20, 'epochs': 50, 'init': 'normal', 'optimizer': 'rmsprop'}
0.471761 (0.079805) with: {'batch_size': 20, 'epochs': 50, 'init': 'normal', 'optimizer': 'adam'}
0.433555 (0.052454) with: {'batch_size': 20, 'epochs': 50, 'init': 'uniform', 'optimizer': 'rmsprop'}
0.471761 (0.079805) with: {'batch_size': 20, 'epochs': 50, 'init': 'uniform', 'optimizer': 'adam'}
0.534884 (0.078619) with: {'batch_size': 20, 'epochs': 100, 'init': 'glorot_uniform', 'optimizer': 'rmsprop'}
0.561462 (0.058214) with: {'batch_size': 20, 'epochs': 100, 'init': 'glorot_uniform', 'optimizer': 'adam'}
0.476744 (0.081397) with: {'batch_size': 20, 'epochs': 100, 'init': 'normal', 'optimizer': 'rmsprop'}
0.475083 (0.080802) with: {'batch_size': 20, 'epochs': 100, 'init': 'normal', 'optimizer': 'adam'}
0.578073 (0.083920) with: {'batch_size': 20, 'epochs': 100, 'init': 'uniform', 'optimizer': 'rmsprop'}
0.566445 (0.052454) with: {'batch_size': 20, 'epochs': 100, 'init': 'uniform', 'optimizer': 'adam'}
0.436877 (0.053243) with: {'batch_size': 20, 'epochs': 150, 'init': 'glorot_uniform', 'optimizer': 'rmsprop'}
0.566445 (0.052454) with: {'batch_size': 20, 'epochs': 150, 'init': 'glorot_uniform', 'optimizer': 'adam'}
0.471761 (0.079805) with: {'batch_size': 20, 'epochs': 150, 'init': 'normal', 'optimizer': 'rmsprop'}
0.438538 (0.058214) with: {'batch_size': 20, 'epochs': 150, 'init': 'normal', 'optimizer': 'adam'}
0.566445 (0.052454) with: {'batch_size': 20, 'epochs': 150, 'init': 'uniform', 'optimizer': 'rmsprop'}
0.576412 (0.042117) with: {'batch_size': 20, 'epochs': 150, 'init': 'uniform', 'optimizer': 'adam'}

Best,

George John Jordan Thomas Aquinas Hayward, Optimist

george-hayward-data-scientist